Structural Feature Selection for Event Logs

Authors

  • Markku Hinkka
  • Teemu Lehto
  • Keijo Heljanko
  • Alexander Jung
Abstract

We consider the problem of classifying business process instances based on structural features derived from event logs. The main motivation is to provide machine-learning-based techniques with quick response times for interactive computer-assisted root cause analysis. In particular, we derive structural features from process mining, such as activity and transition occurrence counts and the ordering of activities, and evaluate them as potential features for classification. We show that adding such structural features increases the amount of information, thus potentially increasing classification accuracy. However, there is an inherent trade-off, as using too many features leads to excessively long run-times for machine learning classification models. One way to improve the run-time of the machine learning algorithms is to select only a small number of features using a feature selection algorithm. However, the run-time required by the feature selection algorithm must also be taken into account, and the classification accuracy should not suffer too much from the feature selection. The main contributions of this paper are as follows. First, we propose and compare six different feature selection algorithms by means of an experimental setup comparing their classification accuracy and achievable response times. Second, we discuss the potential use of feature selection results for computer-assisted root cause analysis, as well as the properties of different types of structural features in the context of feature selection.
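
As a rough illustration of the three kinds of structural features named in the abstract (activity occurrence counts, transition occurrence counts, and activity ordering), the following minimal Python sketch derives them from a single trace. The abstract does not give exact feature definitions, so the function name `structural_features`, the feature-key format, and the encoding of ordering as "first occurrence of one activity precedes first occurrence of another" are all assumptions made for this example, not the authors' method.

```python
from collections import Counter
from itertools import combinations


def structural_features(trace):
    """Build a feature dict for one trace (a list of activity names).

    Hypothetical helper: the key format and the ordering encoding
    are assumptions for illustration only.
    """
    features = Counter()

    # Activity occurrence counts: how often each activity appears.
    for activity in trace:
        features[f"act:{activity}"] += 1

    # Transition occurrence counts: how often b directly follows a.
    for a, b in zip(trace, trace[1:]):
        features[f"trans:{a}->{b}"] += 1

    # Ordering (assumed encoding): flag which of two activities
    # occurs first, based on their first occurrences in the trace.
    first_index = {}
    for i, activity in enumerate(trace):
        first_index.setdefault(activity, i)
    for a, b in combinations(sorted(first_index), 2):
        if first_index[a] < first_index[b]:
            features[f"order:{a}<{b}"] = 1
        else:
            features[f"order:{b}<{a}"] = 1

    return dict(features)


# Toy trace with hypothetical activity names.
trace = ["register", "check", "check", "approve"]
features = structural_features(trace)
```

Even in this toy case the number of ordering features grows quadratically with the number of distinct activities, which hints at why selecting a small subset of features matters for response times.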


Similar resources

Concept drift detection in event logs using statistical information of variants

In recent years, business process management (BPM) has been highly regarded as a means of improving the efficiency and effectiveness of organizations. Extracting and analyzing information on business processes is an important part of this structure. However, these processes do not remain stable over time and may change for a variety of reasons, such as the environment and human resources. These changes ...


Feature Selection in Structural Health Monitoring Big Data Using a Meta-Heuristic Optimization Algorithm

This paper focuses on the processing of structural health monitoring (SHM) big data. Extracted features of a structure are reduced using an optimization algorithm to find a minimal subset of salient features by removing noisy, irrelevant and redundant data. The PSO-Harmony algorithm is introduced for feature selection to enhance the capability of the proposed method for processing the measure...


Generation of a Set of Event Logs with Noise

Process mining is a relatively new research area aiming to extract process models from event logs of real systems. Many new approaches and algorithms are being developed in this field. Researchers and developers usually need to test and evaluate the newly constructed algorithms. In this paper we propose a new approach for the generation of event logs. It serves to facilitate the process of eval...


A Comparison of Iranian Library and Librarian Weblogs with Top Librarianship Weblogs; 1385

Introduction: Web logs are evident tools for librarians. There are three main ways of applying web logs in the field of librarianship: personal use by librarians to upgrade their personal information, as a source of information in the case of libraries, and for their services. The aim of this research is to draw a comparison between Iranian libraries and librarians, and superior librarianshi...


Process Trace Clustering: A Heterogeneous Information Network Approach

Process mining is the task of extracting information from event logs, such as ones generated from workflow management or enterprise resource planning systems, in order to discover models of the underlying processes, organizations, and products. As the event logs often contain a variety of process executions, the discovered models can be complex and difficult to comprehend. Trace clustering help...




Publication date: 2017